35 research outputs found
PatentSBERTa: A Deep NLP based Hybrid Model for Patent Distance and Classification using Augmented SBERT
This study provides an efficient approach for using text data to calculate
patent-to-patent (p2p) technological similarity, and presents a hybrid
framework for leveraging the resulting p2p similarity for applications such as
semantic search and automated patent classification. We create embeddings using
Sentence-BERT (SBERT) based on patent claims. We leverage SBERTs efficiency in
creating embedding distance measures to map p2p similarity in large sets of
patent data. We deploy our framework for classification with a simple Nearest
Neighbors (KNN) model that predicts Cooperative Patent Classification (CPC) of
a patent based on the class assignment of the K patents with the highest p2p
similarity. We thereby validate that the p2p similarity captures their
technological features in terms of CPC overlap, and at the same demonstrate the
usefulness of this approach for automatic patent classification based on text
data. Furthermore, the presented classification framework is simple and the
results easy to interpret and evaluate by end-users. In the out-of-sample model
validation, we are able to perform a multi-label prediction of all assigned CPC
classes on the subclass (663) level on 1,492,294 patents with an accuracy of
54% and F1 score > 66%, which suggests that our model outperforms the current
state-of-the-art in text-based multi-label and multi-class patent
classification. We furthermore discuss the applicability of the presented
framework for semantic IP search, patent landscaping, and technology
intelligence. We finally point towards a future research agenda for leveraging
multi-source patent embeddings, their appropriateness across applications, as
well as to improve and validate patent embeddings by creating domain-expert
curated Semantic Textual Similarity (STS) benchmark datasets.Comment: 18 pages, 7 figures and 4 Table
A Vector Worth a Thousand Counts:A Temporal Semantic Similarity Approach to Patent Impact Prediction
The Privatization of AI Research(-ers): Causes and Potential Consequences -- From university-industry interaction to public research brain-drain?
The private sector is playing an increasingly important role in basic
Artificial Intelligence (AI) R&D. This phenomenon, which is reflected in the
perception of a brain drain of researchers from academia to industry, is
raising concerns about a privatisation of AI research which could constrain its
societal benefits. We contribute to the evidence base by quantifying transition
flows between industry and academia and studying its drivers and potential
consequences. We find a growing net flow of researchers from academia to
industry, particularly from elite institutions into technology companies such
as Google, Microsoft and Facebook. Our survival regression analysis reveals
that researchers working in the field of deep learning as well as those with
higher average impact are more likely to transition into industry. A
difference-in-differences analysis of the effect of switching into industry on
a researcher's influence proxied by citations indicates that an initial
increase in impact declines as researchers spend more time in industry. This
points at a privatisation of AI knowledge compared to a counterfactual where
those high-impact researchers had remained in academia. Our findings highlight
the importance of strengthening the public AI research sphere in order to
ensure that the future of this powerful technology is not dominated by private
interests